Brandt's GLR method & refined HMM segmentation for TTS synthesis application

Authors

  • Safaa Jarifi
  • Dominique Pastor
  • Olivier Rosec
Abstract

This paper compares two automatic segmentation algorithms against standard HMM (Hidden Markov Model) segmentation with forced alignment, from two points of view: the probabilities of insertion and omission, and the accuracy. The first algorithm, hereafter named the refined HMM algorithm, refines the segmentation produced by the standard HMM using a GMM (Gaussian Mixture Model) of each boundary. The second is Brandt’s GLR (Generalized Likelihood Ratio) method, whose goal is to detect signal discontinuities. Provided that the sequence of speech units is known, the experimental results presented in this paper suggest combining the refined HMM algorithm with Brandt’s GLR method and with other algorithms adapted to the detection of boundaries between known acoustic classes.

Similar resources

A fusion approach for automatic speech segmentation of large corpora with application to speech synthesis

This paper deals with the automatic segmentation of large speech corpora in the case when the phonetic sequence corresponding to the speech signal is known. A direct and typical application is corpus-based Text-To-Speech (TTS) synthesis. We start by proposing a general approach for combining several segmentations produced by different algorithms. Then, we describe and analyse three automatic se...

Deep Learning Techniques in Tandem with Signal Processing Cues for Phonetic Segmentation for Text to Speech Synthesis in Indian Languages

Automatic detection of phoneme boundaries is an important sub-task in building speech processing applications, especially text-to-speech synthesis (TTS) systems. The main drawback of Gaussian mixture model hidden Markov model (GMM-HMM) based forced alignment is that the phoneme boundaries are not explicitly modeled. In an earlier work, we had proposed the use of signal processing cues in tan...

An HMM trajectory tiling (HTT) approach to high quality TTS

We propose an HMM Trajectory Tiling (HTT) approach to high quality TTS, which is our entry to Blizzard Challenge 2010. In HTT, a refined HMM is first trained with the Minimum Generation Error (MGE) criterion; the trajectory generated by the refined HMM is then used to guide the search for the closest waveform segment “tiles” in synthesis. Normalized distances between HMM trajectory and those of th...

Decision Tree Classification Approach for Model Selection in Segmenting Mandarin TTS Corpus

High-accuracy automatic segmentation of a Mandarin TTS (text-to-speech) corpus is vital for obtaining high-quality syllable boundaries for corpus-based speech synthesis. Among the existing methods, most studies on automatic segmentation are based upon a single model, ignoring the diverse time marks gained by different models in specific Mandarin boundary environments. In this paper, three hidden Marko...

An Improved Automatic EEG Signal Segmentation Method based on Generalized Likelihood Ratio

It is often necessary to label electroencephalogram (EEG) signals by segments of similar characteristics that are particularly meaningful to clinicians and for assessment by neurophysiologists. Within each segment, the signals are considered statistically stationary, usually with similar characteristics such as amplitude and/or frequency. In order to detect the segment boundaries of a signal, we ...

Publication date: 2005